In an associative learning preparation, the participants were given partial 
reinforcement (PRF) with two different cues. For one of the cues, the 
nonreinforced presentations consisted of pairings of the cue with a neutral 
outcome, whereas these presentations consisted of pairings with an aversive 
outcome for the other cue. The results showed that PRF training produced 
strong responding to the cue paired with the neutral outcome on the 
nonreinforced trials, whereas responding to the cue paired with the aversive 
outcome on the nonreinforced trials was strongly suppressed. The present 
results are problematic for current theories of learning (e.g., Rescorla & 
Wagner, 1972), but can be explained by classical theories involving 
motivational mechanisms (e.g., Konorski, 1967), as well as by a recently 
developed model, in which incompatible outcome expectations compete for 
their expression into behavior (i.e., Pineno & Matute, 2003). 


Since Pavlov (1927) performed his original studies on classical 
conditioning, it has been well known that a conditioned response to a conditioned 
stimulus (CS), formed due to the repeated pairing of the CS with an 
unconditioned stimulus (US), can be attenuated through either presentations 
of the CS without the US (i.e., experimental extinction) or presentations of the 
CS with a motivationally antagonistic US (i.e., counterconditioning). The fact 
that both experimental procedures result in a decrease in the strength and/or 
frequency of the response has encouraged many theorists of learning to 
explain extinction and counterconditioning through common mechanisms. 
Pavlov (1927; see also Konorski, 1948) explained extinction as due to the 
formation of an inhibitory CS-US association, different in nature from the 
excitatory CS-US association, and counterconditioning as due to the 
development of an excitatory association between the CS and the new US. By 
contrast, Konorski (1967) proposed that both extinction and 
counterconditioning are based on the formation of excitatory associations. 
Specifically, during extinction and counterconditioning, the representation (or 
gnostic unit, in his terminology) of the CS becomes associated with the 
representation of the noUS or the new US, respectively. According to 
Konorski, the activation of the representation of a US from a given 
motivational system (i.e., US_1) is incompatible with the activation of the 
representation of the noUS_1, or the representation of a US from a different 
motivational system (i.e., US_2). In other words, these representations are 
mutually antagonistic and their activations are reciprocally inhibited (see also 
Rescorla & Solomon, 1967; Solomon & Corbit, 1974). 

This view of Konorski (1967) implies that, after pairings of a CS with 
US_1, CS-noUS_1 trials (extinction) and CS-US_2 trials (counterconditioning) can be 
perceived by the animal as motivationally equivalent. For example, when US_1 
and US_2 consist of food and footshock, respectively, the absence of US_1 (just 
like the presence of US_2) can produce an aversive reaction (e.g., frustration, 
Amsel, 1958), and the absence of US_2 (just like the presence of US_1) can 
produce an appetitive reaction (e.g., relaxation, Denny, 1971). This functional 
equivalence of the representations of noUS_1 and US_2 regarding their 
potential to interfere with the activation of US_1 does not necessarily imply 
that noUS_1 and US_2 will produce a similar degree of interference. It can be 
assumed that the impact of CS-US_2 trials will always be higher than that of 
CS-noUS_1 trials. There are two reasons for this. First, the physical 
presentation of US_2 can be expected to be more salient than the mere absence 
of US_1. Second, the presentation of CS-US_2 trials also implies the 
presentation of CS-noUS_1 trials, therefore allowing for learning of both 
CS-noUS_1 and CS-US_2 associations (Bouton, 1993). 

The explanation of extinction and counterconditioning offered by 
Konorski (1967), therefore, not only explains both phenomena according to a 
single mechanism (learning of an excitatory CS-antagonistic US association), 
but also explains why counterconditioning treatment usually shows a higher 
effectiveness than extinction treatment in the suppression of the target 
response (e.g., Gambrill, 1967; Moore, 1986). Konorski’s view had few 
precedents in the field of associative learning due to its ability to provide an 
integrated account of many different phenomena of interference between 
outcomes. For example, both extinction and counterconditioning phenomena 
are explained as effects that arise from learning of an association between the 
CS and a different US (i.e., noUS_1 in extinction, and US_2 in 
counterconditioning). Proactive counterconditioning (i.e., impaired responding 
during CS-US_2 trials due to previous CS-US_1 pairings; see, e.g., Krank, 1985; 
Scavio, 1974) can also be seen as analogous to latent inhibition (i.e., impaired 
responding during CS-US_1 trials due to previous CS-noUS_1 presentations; 
see, e.g., Lubow, 1973). Also, the conditioned suppression suffered by an 
appetitive instrumental response due to the presentation of an aversive CS 
(e.g., Annau & Kamin, 1961; Bolles & Fanselow, 1980; Bouton & Bolles, 
1980; Church, 1969; Estes & Skinner, 1941) can be explained by this theory 
as functionally equivalent to the summation test of conditioned inhibition (i.e., 
decrease of responding to a CS due to the simultaneous presentation of an 
inhibitor of the US; Rescorla, 1969). More importantly, Konorski’s theory 
encouraged a great amount of research that supported many of its elegant 
predictions (see, e.g., Goodman & Fowler, 1983; Dickinson, 1977; Dickinson 
& Dearing, 1979; see Dickinson & Pearce, 1977, for a review). 

However, these features of Konorski’s (1967) theory have been largely 
ignored by traditional models of classical conditioning (e.g., Mackintosh, 
1975; Pearce & Hall, 1980; Rescorla & Wagner, 1972; Wagner, 1981). First, 
some of these models (e.g., Mackintosh, 1975; Rescorla & Wagner, 1972) do 
not acknowledge the possibility of concurrent CS-US and CS-noUS 
associations. According to these models, excitatory and inhibitory learning 
consist, respectively, of an increase or decrease in the net strength of an 
association between the representations of the CS and the US (but see 
Bouton, 1993; Pearce, 1987; Pearce & Hall, 1980; Wagner, 1981). Second, 
according to all these models, extinction and counterconditioning are 
exclusively due to the absence of the US that was previously paired with the 
CS during the original acquisition phase. This is clearly represented in the 
learning rule of the Rescorla-Wagner model: 

$$\Delta V_{CS}^{n} = \alpha \cdot \beta \cdot (\lambda - \Sigma V^{n-1}) \qquad (1)$$

In this equation, $\Delta V_{CS}^{n}$ represents the change in associative strength of 
the CS on trial n. $\alpha$ and $\beta$ are learning-rate parameters representing the 
salience of the CS and the US, respectively. These parameters adopt values 
between 0 and 1, as a function of their corresponding salience (in the 
Rescorla-Wagner model, the perceived physical intensity). The parenthetical 
term (i.e., $\lambda - \Sigma V^{n-1}$) represents the discrepancy between the amount of 
associative strength that can be supported by the US ($\lambda$) and the current total 
associative strength acquired, until trial n-1, by all the CSs present on trial n 
($\Sigma V^{n-1}$). The value of $\lambda$ will depend on the presence or absence of the US on 
trial n: when the US is presented, $\lambda$ adopts a value of 1; when the US is 
absent, $\lambda$ adopts a value of 0. 

Therefore, the acquisition of a conditioned response to a CS (i.e., CS-US 
trials) occurs, according to the Rescorla-Wagner model, due to a 
progressive strengthening (up to 1) of the CS-US association, based on the 
discrepancy between the expected and actual occurrence of the US (i.e., 
$1 - \Sigma V^{n-1}$). Since this discrepancy will be smaller as the acquisition training 
proceeds, the increments of the associative strength gained by the CS will 
also be smaller, resulting in a progressively decelerated curve of acquisition. 
During extinction training (i.e., CS-noUS trials), the associative strength of 
the CS decreases (down to 0) due to the existing discrepancy between the 
expectation of the US and its actual absence (i.e., $0 - \Sigma V^{n-1}$). As occurred 
during acquisition, this discrepancy will be smaller as the extinction training 
proceeds, resulting in smaller negative increments of the associative strength 
and, hence, in a progressively decelerated curve of extinction. 
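
The decelerated acquisition and extinction curves described above can be illustrated with a minimal sketch of Equation 1 for a single cue trained alone. The function names and the learning-rate values (alpha = beta = 0.5) are arbitrary assumptions chosen for illustration; they are not parameters reported in this article.

```python
def rescorla_wagner_update(v, alpha, beta, lam):
    """Return the associative strength after one trial (Equation 1, single cue)."""
    return v + alpha * beta * (lam - v)


def train(v_start, n_trials, lam, alpha=0.5, beta=0.5):
    """Run n_trials with a fixed lambda and return the strength after each trial."""
    v = v_start
    history = []
    for _ in range(n_trials):
        v = rescorla_wagner_update(v, alpha, beta, lam)
        history.append(v)
    return history


# Acquisition: the US is present on every trial (lambda = 1).
acquisition = train(0.0, 20, lam=1.0)
# Extinction: the US is absent on every trial (lambda = 0).
extinction = train(acquisition[-1], 20, lam=0.0)

# Trial-by-trial changes shrink in size as training proceeds.
acquisition_steps = [b - a for a, b in zip([0.0] + acquisition, acquisition)]
extinction_steps = [b - a for a, b in zip([acquisition[-1]] + extinction, extinction)]
```

In this sketch every acquisition step is positive and smaller than the previous one, and every extinction step is negative and smaller in magnitude than the previous one, which is the deceleration the text refers to.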

Importantly, in the Rescorla-Wagner (1972) model (see also 
Mackintosh, 1975; Pearce & Hall, 1980; Wagner, 1981) the value of $\lambda$ is 
exclusively determined by the presence or absence of its corresponding US. 
Therefore, according to the Rescorla-Wagner model, whether an expected US 
is merely absent (as occurs during extinction) or replaced by another US (as 
occurs during counterconditioning) is completely irrelevant. Thus, whereas 
Konorski (1967) viewed extinction as a kind of counterconditioning, the 
traditional models of learning (e.g., Rescorla & Wagner, 1972) treat 
counterconditioning as a kind of extinction. As a consequence, the traditional 
models of learning (contrary to Konorski’s) are unable to explain the higher 
effectiveness of counterconditioning treatment relative to extinction treatment in 
suppressing conditioned behavior (e.g., Gambrill, 1967; Moore, 1986). 

The fact that the suppression of behavior to a CS due to its 
counterconditioning with a different US occurs at a faster rate than the 
extinction of behavior due to the mere nonreinforcement of the CS might be 
viewed as unchallenging because in both procedures, regardless of the 
different rates, responding does decrease. However, a completely different 
outcome can be expected if the different trial types of extinction and 
counterconditioning are interspersed during training. In the case of extinction, 
interspersing the CS-US_1 and CS-noUS_1 trials would result in the typical 
partial reinforcement (PRF) procedure (e.g., Hartman & Grant, 1960), which 
is known to produce persistent responding in the face of subsequent 
extinction (Amsel, 1958). But, in the case of counterconditioning, 
interspersing the CS-US_1 and CS-US_2 trials would result in both a PRF 
procedure and a partial punishment procedure, which is known to yield strong 
and persistent suppression of the response (e.g., Storms & Boroczi, 1966). In 
this situation, according to the traditional models of learning (e.g., Rescorla & 
Wagner, 1972), responding to a CS, A, trained with both US_1 and the absence 
of US_1, should be similar to responding to a CS, B, trained with US_1 and US_2, 
whereas according to Konorski (1967; see also Rescorla & Solomon, 1967; 
Solomon & Corbit, 1974) responding to CS B should be more strongly 
suppressed than responding to CS A. 

The present experiment was performed in order to test whether 
responding to a partially reinforced cue (i.e., analogous to the CS in the 
terminology of human associative learning) can be affected by the 
motivational value of the outcome (i.e., analogous to the US in the 
terminology of associative learning) presented on the nonreinforced trials. 
Three motivationally different outcomes were used in this experiment: an 
appetitive outcome (i.e., O_Ap), an aversive outcome (i.e., O_Av), and a neutral 
outcome (i.e., O_Ne). The motivational value of these outcomes was given 
exclusively through instructions: the participants could either gain or lose 
points by responding on those trials in which the cue was followed by O_Ap or 
O_Av, respectively. The number of points gained or lost on each trial was positively 
correlated with the number of responses performed during the presentation of 
the cue. The participants were also instructed about the possibility of neither 
gaining nor losing points on a given trial, therefore providing a third, neutral, 
outcome (i.e., O_Ne). It is also important to mention that the points 
accrued by the participants during the task were not 
exchanged for any good, such as money, after the experiment (e.g., 
O’Donnell, Crosbie, Williams, & Saunders, 2000). Therefore, the 
instructions, together with the participants’ interest in achieving a high 
performance with the task (i.e., to accrue a high number of points) provided 
the motivational value of the different outcomes used in the experiment. 

The critical question in this experiment was: is responding to a partially 
reinforced cue affected by the motivational value of the outcome presented on 
the nonreinforced trials? In order to answer this question, all participants were 
exposed to two different cues, A and B, trained in a PRF schedule: cues A and 
B were reinforced on 50% of the trials (i.e., A-O_Ap or B-O_Ap trials) and 
nonreinforced on the other 50% of the trials. However, for cue A the 
nonreinforced trials consisted of trials in which the cue was followed by the 
neutral outcome (i.e., A-O_Ne trials), whereas for cue B the nonreinforced trials 
consisted of trials in which the cue was followed by the aversive outcome (i.e., 
B-O_Av trials). 


METHOD 

Participants and Apparatus. Fourteen students 
(1 man and 13 women, with a mean age of 19.85 years [SEM = 0.36]) from 
Deusto University volunteered for the study. The experiment was conducted 
using personal computers and participants were run in individual cubicles. 


Design and Procedure. The preparation used in this experiment was 
the same as that previously used by Pineno and colleagues for the study of 
associative learning with humans (e.g., Pineno, Ortega, & Matute, 2000; 
Pineno & Matute, 2000).¹ In this preparation, participants were asked to 
imagine that they were to rescue a group of refugees by helping them escape 
from a war zone in trucks. A translation of the instructions from Spanish 
reads as follows: 


¹ A demonstration version of this preparation can be downloaded from 
http://sirio.deusto.es/matute/software.htmI 
(see also http://binsweb.binsliamton.edu/~learnins/task.htm for a new adaptation of this 
preparation). 

Screen 1 

Imagine that you are a soldier for the United Nations. Your mission 
consists of rescuing a group of refugees that are hidden in a 
ramshackle building. The enemy has detected them and has sent forces 
to destroy the building... But, fortunately, they rely on your cunning to 
escape the danger zone before that happens. 

You have several trucks for rescuing the refugees, and you have to help 
them get into those trucks. There are two ways of placing people in the 
trucks: 

Pressing the space bar repeatedly, so that one person per press is 
placed in a truck. 

Maintaining the space bar pressed down, so that you will be able to 
load people very rapidly. 

If you rescue a number of persons in a given trip, they will arrive at 
their destination alive, and you will be rewarded with a point for each 
person. You must gain as many points as possible! 

Screen 2 

But... your mission will not be as simple as it seems. The enemy knows 
of your movements and could have placed deadly mines on the road. If 
the truck hits a mine, it will explode, and the passengers will die. Each 
dead passenger will count as one negative point for you. 

Fortunately, the colored lights on the SPY-RADIO will tell you about 
the state of the road. These lights can indicate that: 

The road will be free of mines. —> The occupants of the truck will be 
liberated. —> You will gain points. 

The road will be mined. —> The occupants of the truck will die. —> You 
will lose points. 

There are no mines, but the road is closed. —> The occupants of the 
truck will neither die nor be liberated. —> You will neither gain nor lose 
points: You will maintain your previous score. 

Screen 3 

At first, you will not know what each color light of the SPY-RADIO 
means. However, as you gain experience with them, you will learn to 
interpret what they mean. 

Thus, we recommend that you: 

Place more people in the truck the more certain you are that the road 
will be free of mines (keep the space bar continuously pressed down 
ONLY if you are completely sure that there are no mines, because in 
this way you will put a lot of people in the truck...). 

Place fewer people in the truck the more certain you are that the 
road is mined. 

After these instructions, participants were shown a fourth screen that 
gave instructions about contextual changes. Although contextual changes were 
not used in the present experiment, in order to avoid making more changes 
than necessary between different experimental series conducted with the same 
preparation, we maintained the fourth instruction screen programmed in the 
task. A translation of the fourth instruction screen reads as follows: 

Screen 4 

Finally, it is important to know that your mission may take place in 
several different towns. The colors on the SPY-RADIO can mean the 
same or a very different thing depending on the town in which you are. 
Thus, it is important to pay attention to the message that indicates the 
place in which you are. If you travel to another town, the message 
indicating the name of the town will change. When a change of 
destination is occurring, you will read the message ‘Traveling to 
another town’, so you will be continuously informed about such 
changes. Nevertheless, sometimes you might end up returning to the 
same town even if you have seen the message that indicates that you 
are traveling. 

Do not worry if all this looks very complex at this point. Before we 
start, you will have the opportunity to see the location of everything 
(radio, town name, messages, scores, etc.) on the screen, and to ask 
the experimenter about anything that is unclear. 


The cues were presented in the spy-radio, which consisted of six panels 
in which colored lights could be presented. Cues A and B were blue and 
yellow lights, counterbalanced. All cues were presented for 3 s. On each trial, 
the termination of the cue coincided with the presentation of an outcome. The 
appetitive outcome (O_Ap) consisted of (a) the message ‘[n] refugees safe at 
home!!!’ (with [n] being the number of refugees placed in the truck 
during the cue presentation) and (b) gaining one point for each refugee who 
was liberated. The aversive outcome (O_Av) consisted of (a) the message ‘[n] 
refugees have died!!!’ and (b) losing one point for each refugee who died in 
the truck. The neutral outcome (O_Ne) consisted of (a) the message ‘Road 
closed’ and (b) maintaining the previous score.² Outcome messages were 
presented for 3 s. During the intertrial intervals, the lights were turned off (i.e., 
gray). The mean intertrial interval duration was 5 s, ranging between 3 and 7 
s. 

² The neutral outcome (O_Ne) was presented (i.e., instead of presenting no outcome at all) 
in order to give the participants feedback about the consequences of their behavior, not 
only on reinforced (O_Ap) and punished (O_Av) trials, but also on nonreinforced (O_Ne) trials. 
This feedback given on A→O_Ne trials, like the feedback given on B→O_Av trials, aimed to 
make explicit the absence of O_Ap. Thus, if anything, the presentation of a neutral 
outcome increased the effectiveness of nonreinforced trials, while minimizing 
unnecessary differences between the A→O_Ne and B→O_Av trials. 

The number of refugees loaded in the truck during the cue was 
reported in a box on the screen, this number being immediately updated after 
each response. Although pressing the space bar during the outcome message 
had no consequences, the number of refugees loaded in the truck during the 
previous cue presentation remained visible during the presentation of the 
outcome. Upon outcome termination, the score panel was initialized to 0. 
Responses that occurred during the intertrial intervals had no consequence 
and were not reflected in the panel. 

The number of refugees that participants risked placing in the truck on 
each trial was our dependent variable. During each cue presentation, each 
response (i.e., pressing the space bar once) placed one refugee in the truck, 
whereas holding the space bar down placed up to 30 refugees per second in 
the truck. Therefore, the number of refugees placed in the truck not only 
correlated with the number of responses (i.e., pressing the space bar), but also 
with the intensity of these responses (i.e., holding the space bar down). 
However, for reasons of simplicity, we will refer to our dependent variable as 
the number of responses. Alternatively, one could view our dependent variable 
as reflecting the participants’ expectation of the appetitive outcome (O_Ap). 
Presumably, the more certain the participants were that the cue would be 
followed by O_Ap, the greater the number of refugees they would place in the truck, 
whereas the more certain participants were that the truck would explode (O_Av) 
or that the road would be closed (O_Ne), the fewer refugees they would place in 
the truck. 

All participants in the experiment were given 40 trials, 20 trials with 
each of cues A and B. Half of the presentations of each cue were followed 
by O_Ap, and the other half were followed by a nonappetitive outcome. On these 
nonreinforced trials, cues A and B were paired with O_Ne and O_Av, respectively. 
Thus, both cues A and B were exposed to a PRF procedure in which 
responding was reinforced on 50% of the trials, and responding was either 
nonreinforced (cue A) or punished (cue B) on the other 50% of the trials. The 
different trial types were presented following a pseudorandom sequence, 
which was given to the participants twice during the experiment. This 
sequence was A→O_Ap, A→O_Ap, B→O_Ap, A→O_Ne, A→O_Ne, B→O_Av, B→O_Av, 
A→O_Ne, B→O_Ap, B→O_Av, A→O_Ap, B→O_Av, B→O_Ap, A→O_Ne, B→O_Ap, 
B→O_Av, A→O_Ne, A→O_Ap, B→O_Ap, A→O_Ap. 
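
As a consistency check on the design, the pseudorandom sequence can be encoded and tallied. The sketch below is only an illustration: the tuple encoding and the names BLOCK, SESSION, and reinforcement_rate are our own, and the sequence is transcribed from the text above.

```python
from collections import Counter

# One 20-trial block, as (cue, outcome) pairs. "Ap" = appetitive,
# "Ne" = neutral, "Av" = aversive outcome.
BLOCK = [
    ("A", "Ap"), ("A", "Ap"), ("B", "Ap"), ("A", "Ne"), ("A", "Ne"),
    ("B", "Av"), ("B", "Av"), ("A", "Ne"), ("B", "Ap"), ("B", "Av"),
    ("A", "Ap"), ("B", "Av"), ("B", "Ap"), ("A", "Ne"), ("B", "Ap"),
    ("B", "Av"), ("A", "Ne"), ("A", "Ap"), ("B", "Ap"), ("A", "Ap"),
]

# The block was given twice, for 40 trials in total.
SESSION = BLOCK * 2
counts = Counter(SESSION)


def reinforcement_rate(cue):
    """Proportion of a cue's trials that ended in the appetitive outcome."""
    outcomes = [outcome for c, outcome in SESSION if c == cue]
    return outcomes.count("Ap") / len(outcomes)


for trial_type, n in sorted(counts.items()):
    print(trial_type, n)
print("A:", reinforcement_rate("A"), "B:", reinforcement_rate("B"))
```

The tally gives 10 trials of each of the four trial types, that is, 20 trials per cue with a reinforcement rate of 0.5, in agreement with the description of the procedure.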


RESULTS 

Figure 1 depicts the results of the experiment. As can be appreciated 
from the figure, responding to both cues A and B (or, from an alternative view, 
the ratings of these cues as predictors of O_Ap) initially increased from Trial 1 
to Trial 2. However, after Trial 2 responding to A was stronger than 
responding to B on most of the trials. This impression was confirmed by a 2 
(Cue: A vs. B) x 20 (Trials) ANOVA on the mean number of responses, 
which showed main effects of both cue, F(1, 13) = 11.81, p < .01, and trials, 
F(19, 247) = 2.32, p < .01, as well as a Cue x Trials interaction, F(19, 247) = 
2.41, p < .01. Also, pairwise comparisons showed that responding to cue A 
was significantly stronger than responding to B on Trials 3, 4, 7-9, 11-14, and 
18-20, all Fs(1, 13) > 5.69, ps < .05. The response elicited by A was also 
marginally stronger than that elicited by B on Trials 15 and 17 (ps = .06 and 
.07, respectively). On the rest of the trials, responding to cues A and B did not 
differ (all ps > .10). 
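
The degrees of freedom reported above follow from the design (14 participants, 2 cues, 20 trials, both factors within subjects). The following sketch recomputes them; it assumes the standard degrees of freedom for a two-way repeated-measures ANOVA without any sphericity correction, and the function name is hypothetical.

```python
def rm_anova_df(n_subjects, levels_a, levels_b):
    """Degrees of freedom (effect, error) for a two-way within-subject ANOVA.

    Assumes both factors are fully crossed within subjects and that no
    sphericity correction is applied.
    """
    s = n_subjects - 1
    a = levels_a - 1
    b = levels_b - 1
    return {
        "A": (a, a * s),
        "B": (b, b * s),
        "AxB": (a * b, a * b * s),
    }


# 14 participants, Cue (2 levels) x Trials (20 levels).
df = rm_anova_df(n_subjects=14, levels_a=2, levels_b=20)
print(df)
```

With these inputs the cue effect has (1, 13) degrees of freedom, and both the trials effect and the Cue x Trials interaction have (19, 247).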

These results were not influenced by any differential responding during 
the pre-cue period, as shown by a 2 (Cue: A vs. B) x 20 (Trials) ANOVA on 
the number of responses during the 3-s period prior to the presentation of the 
cue, which yielded neither main effects nor a significant interaction (all ps > .13). 



Figure 1. Results of the experiment. Mean number of responses to cues 
A and B. Cue A was followed by either O_Ap (i.e., an appetitive 
outcome) or O_Ne (i.e., a neutral outcome), whereas cue B was followed 
by either O_Ap or O_Av (i.e., an aversive outcome). Error bars depict 
the standard error of the mean. 


DISCUSSION 

The results of the present experiment showed that responding to cue B 
(i.e., a cue paired with both an appetitive and an aversive outcome on different 
trials) was more strongly impaired than responding to cue A (i.e., a cue paired 
with both an appetitive and a neutral outcome on different trials). As 
previously discussed, these results cannot be explained by traditional models 
of learning (e.g., Mackintosh, 1975; Pearce & Hall, 1980; Rescorla & 
Wagner, 1972; Wagner, 1981). According to these models, responding to a 
cue in a PRF schedule is only determined by the ratio of reinforced and 
nonreinforced trials, with total independence of the motivational quality of the 
outcome presented during the nonreinforced trials. This is shown in the 
simulation³ of the present experiment following the Rescorla and Wagner 
model (see top panel of Figure 2). As can be seen in this simulation, this 
model predicts that both cues A and B will progressively increase their 
associative strength as training proceeds, reaching an asymptotic level of V_A = 
V_B = 0.4. Thus, the associative strength of cues A and B will nearly resemble 
the actual contingency of 0.5 for each cue with O_Ap (i.e., ΔP, Allan, 1980), 
although slightly smaller due to the overshadowing (Pavlov, 1927) produced 
by the contextual cues (which were included in the simulation). 
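
The logic of this simulation can be sketched as follows. This is not the simulation program used for Figure 2; it is a minimal illustration that takes the parameter values listed in Footnote 3 and the trial sequence from the Method section, and it additionally assumes that the context is present only on cue trials (no context-alone trials during the intertrial intervals). It should therefore not be expected to reproduce the exact values in the figure.

```python
# Parameter values from Footnote 3; everything else is an assumption.
ALPHA = {"A": 1.0, "B": 1.0, "CTX": 0.1}
BETA_PRESENT = 1.0   # salience of O_Ap when it is presented
BETA_ABSENT = 0.35   # salience of the absence of O_Ap
LAMBDA = 1.0

# "+" = followed by O_Ap; "-" = not followed by O_Ap (O_Ne for cue A,
# O_Av for cue B). The model has no way of telling these two apart.
BLOCK = "A+ A+ B+ A- A- B- B- A- B+ B- A+ B- B+ A- B+ B- A- A+ B+ A+".split()


def simulate(trials):
    """Return the associative strengths of A, B, and the context after each trial."""
    v = {"A": 0.0, "B": 0.0, "CTX": 0.0}
    history = []
    for trial in trials:
        cue, reinforced = trial[0], trial[1] == "+"
        present = [cue, "CTX"]
        lam = LAMBDA if reinforced else 0.0
        beta = BETA_PRESENT if reinforced else BETA_ABSENT
        # Shared prediction error based on all stimuli present on the trial.
        error = lam - sum(v[s] for s in present)
        for s in present:
            v[s] += ALPHA[s] * beta * error
        history.append(dict(v))
    return history


history = simulate(BLOCK * 2)
for n, v in enumerate(history, start=1):
    print(n, round(v["A"], 3), round(v["B"], 3), round(v["CTX"], 3))
```

The point of the sketch is structural: the update for a B→O_Av trial is computed exactly like the update for an A→O_Ne trial, so any difference between the two cues can only come from their positions in the trial sequence, never from the identity of the nonappetitive outcome.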

Some post hoc manipulations could be performed in the parameters of 
the Rescorla and Wagner (1972) model in order to enable this model to 
explain the present results. For example, in those trials in which O_Ap is not 
present, its salience (i.e., $\beta_{noO_{Ap}}$) could be assumed to be higher due to the 
presentation of O_Av, in comparison to its salience when O_Ne is presented. Or, 
alternatively, when O_Av is presented, the total amount of associative strength 
supported by O_Ap (i.e., $\lambda$) could adopt a negative value (e.g., -1) instead of a 
null value. Finally, the strength of the appetitive response could be viewed as 
reflecting the difference between the associative strengths that the cue acquired 
with O_Ap and O_Av (i.e., R = V_Ap - V_Av, see Krank, 1985). However, none of 
these manipulations is free from theoretical problems. First, in the Rescorla 
and Wagner model an absent cue has a null salience (i.e., $\alpha_{CS} = 0$), whereas an 
absent outcome has a positive salience (i.e., $\beta_{noO_{Ap}} > 0$). This assumption, 
which is necessary in order to allow this model to explain learning in the 
absence of the outcome (e.g., extinction), implies an asymmetrical processing 
of the cues and outcomes (but see Wagner, 1981). Although the latter 
assumption can be acceptable, it is harder to see how this model could assume 
that the value of $\beta_{noO_{Ap}}$ can be greater due to the presentation of an aversive 
outcome (O_Av) than when a neutral outcome (O_Ne) is presented. Second, 
assuming that $\lambda$ of O_Ap adopts a value of 0 and -1 during the presentations of 
O_Ne or O_Av, respectively, implies that, whereas the extinction procedure would 
merely result in a loss of the previously acquired positive associative strength 
(i.e., down to zero), counterconditioning training would result in the learning 
of an inhibitory association (i.e., an associative strength of -1). This problem 
also applies to the view of the appetitive response as reflecting the difference 
between V_Ap and V_Av. In this case, counterconditioning would also be expected 
to yield a net negative or inhibitory appetitive response (i.e., R = V_Ap - V_Av = 0 
- 1). 
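
The second problem can be made concrete with a short sketch of the modified learning rule. It assumes a single cue, a combined learning rate of 0.3 (an arbitrary illustrative value), and the post hoc assumption discussed above that $\lambda$ adopts a value of -1 on trials with the aversive outcome; the function names are our own.

```python
def update(v, lam, alpha_beta=0.3):
    """One Rescorla-Wagner update for a single cue with a combined learning rate."""
    return v + alpha_beta * (lam - v)


def run(v, lam, n_trials):
    """Apply the update n_trials times with a fixed lambda."""
    for _ in range(n_trials):
        v = update(v, lam)
    return v


# Acquisition with the appetitive outcome (lambda = 1).
v_acquired = run(0.0, lam=1.0, n_trials=50)
# Extinction: the appetitive outcome is merely absent (lambda = 0).
v_extinguished = run(v_acquired, lam=0.0, n_trials=50)
# Counterconditioning under the post hoc assumption (lambda = -1).
v_counterconditioned = run(v_acquired, lam=-1.0, n_trials=50)

print(v_acquired, v_extinguished, v_counterconditioned)
```

Under this assumption extinction returns the associative strength to approximately zero, whereas counterconditioning drives it toward -1, which is the net inhibitory strength that the text identifies as problematic.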


³ The parameters used in this simulation were: $\alpha_A = \alpha_B = 1$, $\alpha_{CTX} = 0.1$, $\beta_{O_{Ap}} = 1$, $\beta_{noO_{Ap}} = 
0.35$, $\lambda = 1$. This simulation was performed using the program developed by Jason M. 
Tangen. This software can be downloaded from 

http://univmail.mcmaster.ca/~tansenim/main.html 

[Figure 2: two panels, each with Trials on the x-axis.] 

Figure 2. Simulation of the results of the experiment by the Rescorla 
and Wagner (1972) model (top panel) and IMAL (Pineno & Matute, 
2003, bottom panel). 

These results support the predictions of Konorski’s (1967) theory (see 
also Rescorla & Solomon, 1967; Solomon & Corbit, 1974), which states that 
the expression of an association between a cue and an appetitive outcome (i.e., 
cue-O_Ap) is more strongly impaired by training the cue with a motivationally 
incompatible outcome (i.e., O_Av) than with a neutral outcome (i.e., O_Ne). 
However, since no real appetitive and aversive outcomes were presented in the 
present experiment (i.e., the outcomes were endowed with motivational value 
through instructions), in order for Konorski’s theory to explain the present 
results, it should assume that the instructions enabled the presentation and 
expectation of the outcomes to activate antagonistic central emotional systems. 
However, this assumption is questionable since one of the most prominent 
features of this kind of preparation for the study of human learning is 
precisely its use of stimuli of low (or even null) biological significance (see 
Miller & Matute, 1996). 

The results of this experiment can be straightforwardly explained by 
Pineno and Matute’s (2003) integrative model of associative learning 
(IMAL). According to this model, a cue can become associated with the 
representation of O_Ap, as well as with the representations of O_Av and O_Ne. 
According to IMAL, the presentation of A-O_Ne and B-O_Av trials will result in 
the formation of not only A-O_Ap and B-O_Ap inhibitory associations 
(Konorski, 1948), but also A-O_Ne and B-O_Av excitatory associations 
(Konorski, 1967). Thus, the presentation of each cue will elicit the simultaneous 
expectation of incompatible outcomes (i.e., O_Ap and O_Ne in the case of cue A, 
and O_Ap and O_Av in the case of cue B). As a consequence, the expression of 
the target outcome (O_Ap) will be impaired, not only by the learning of an 
inhibitory association between each cue and O_Ap, but also by the expectation 
of the alternative and incompatible outcome. Moreover, in the framework of 
IMAL, learning of the B-O_Av association will proceed faster than learning of 
the A-O_Ne association due to the high salience of O_Av compared to that of O_Ne 
(i.e., because O_Av is motivationally more relevant than O_Ne). Thus, from the 
initial trials of training, the expression of the O_Ap expectation will be more 
strongly impaired in the presence of B than in the presence of A, due to the 
interference caused by the expectation of O_Av (produced by the B-O_Av 
association) being stronger than that caused by the expectation of O_Ne 
(produced by the A-O_Ne association). This is depicted in the bottom panel of 
Figure 2, which shows the simulation⁴ of the present experiment by IMAL. 

⁴ The parameters used in this simulation were the predefined parameters for simulations of 
experiments with human participants in the simulation program of Pineno and Matute’s 
(2003) model. This program can be downloaded from 
http://bineweb.binshamton.edu/~learninB/modeI.htm 

There are two important points that were not addressed by the 
experiment and that deserve consideration. First, based on the previous 
explanation of the present results by Pineno and Matute’s (2003) IMAL, it is 
evident that this model predicts that, given a sufficient number of A-O_Ne trials, 
this association should completely interfere with the expression of the A-O_Ap 
association (as the B-O_Av association did with fewer trials). In other words, 
this model predicts that, as PRF training proceeds, interference caused by the 
expectation of O_Ne will become more pervasive. Although this prediction 
apparently contradicts the observation of response persistence in PRF 
procedures (Amsel, 1958), it receives some indirect support from experiments 
showing that extinction occurs more rapidly following extensive PRF than 
following PRF with a moderate number of training trials (see McCain, Lee, & 
Powell, 1962). However, the small number of trials in the present experiment 
did not allow us to directly test this prediction. Future experimental work 
should try to assess whether, as IMAL predicts, extensive PRF training with a 
neutral outcome produces, in the long run, the same effect produced by PRF 
training with an alternative aversive outcome in few trials. 

Finally, it is important to acknowledge the potential influence of the use 
of a within-subject design in the present experiment. Because all participants 
received training with both cues A and B, responding to each cue could be (at 
least partially) determined, not only by the outcomes directly paired with the 
cue itself, but also by the outcomes paired with the alternative cue. That is, 
responding to cue A could depend not only on the interaction between the 
expectations of O_Ap and O_Ne, but also on the expectation of the absence of 
O_Av (i.e., due to O_Av being always presented in the absence of cue A). 
Symmetrically, responding to cue B could depend not only on the interaction 
between the expectations of O_Ap and O_Av, but also on the expectation of the 
absence of O_Ne (i.e., due to O_Ne being always presented in the absence of 
cue B). If cues A and B were learned as inhibitors of O_Av and O_Ne, 
respectively, then cue A (but not cue B) would become a signal for safety and, 
as a consequence, responding to cue A would be more strongly enhanced than 
responding to cue B. Moreover, just as the presentations of B→O_Av trials 
could have increased responding to cue A, the presentations of A→O_Ap trials 
could have enhanced the suppression of responding to cue B. This latter 
possibility is suggested by studies showing that the availability of an 
alternative source of reinforcement (i.e., responding to cue A, in the present 
experiment) increases the effectiveness of punishment treatments (e.g., Herman 
& Azrin, 1964). In sum, the use of a within-subject design in the present 
experiment could have determined to some extent the observed differential 
responding to the cues. However, according to Pineno and Matute’s (2003) IMAL, 
although weak inhibitory A-O_Av and B-O_Ne associations could have been formed 
in the present experiment, responding to cues A and B mainly depended on the 
interaction between the expectations of the outcomes directly paired with each 
cue. Therefore, this model predicts that the present results should be 
replicable using a between-group design in which partial reinforcement and 
partial punishment treatments are given in different conditions. In our 
opinion, this prediction of IMAL also deserves future empirical work. 
SUMMARY 

Differential Effects of Nonreinforcement and Punishment in 
Humans. In an associative learning preparation, the participants were given 
partial reinforcement (PRF) with two different cues. For one of the cues, the 
nonreinforced presentations consisted of pairings of the cue with a neutral 
outcome, whereas these presentations consisted of pairings with an aversive 
outcome for the other cue. The results showed that PRF training produced 
strong responding to the cue paired with the neutral outcome on the 
nonreinforced trials. However, responding to the cue paired with the aversive 
outcome on the nonreinforced trials was strongly suppressed. The present 
results are problematic for current theories of learning (e.g., Rescorla & 
Wagner, 1972), but can be explained by classical theories involving 
motivational mechanisms (e.g., Konorski, 1967), as well as by a recently 
developed model, in which incompatible outcome expectations compete for 
their expression into behavior (i.e., Pineno & Matute, 2003). 